ABSTRACT

This  paper  investigates  some  properties  of  applying  Probit  and  ID3  methods  to 
the  analysis  of  accounting  classification  problems.  The  particular  accounting  problem 
examined is the LIFO/FIFO choice. Both original and hold-out samples are used to
study the effects of the training sample size and the nature of the data set on the
accuracy of classification. The results indicate that (1) Probit and ID3 identify different
factors that affect LIFO/FIFO choice; (2) in hold-out tests, ID3 performs better when
the  sample  size  of  the  input  data  set  is  small  relative  to  the  total  population;  whereas 
Probit  performs  better  when  the  sample  size  is  relatively  large;  and  (3)  ID3  performs 
better  when  the  input  data  set  is  dominated  by  nominal  variables;  whereas  Probit 
performs better otherwise.

1. INTRODUCTION

In the past decade, Probit has been one of the primary methods for studying
accounting classification problems such as LIFO/FIFO choices or bankruptcy prediction
(e.g., Dopuch and Pincus 1988; Hagerman and Zmijewski 1979; Lee and Hsieh 1985).
Although Probit has been argued to be theoretically superior to both multivariate
discriminant analysis (MDA) and ordinary least squares regression (e.g., Dietrich and
Kaplan 1982; Ohlson 1980)[1] in classification research, limitations exist when nominal
variables are involved. In this case, dummy variables must be used to represent
different values of the nominal variables, which may result in a violation of the
normality assumption that the relationship between the dependent variable and
independent variables is a cumulative normal distribution function (Aldrich and Nelson
1984). In addition, the assumption that the dependent variable is a linear function of
the independent variables may be questionable when nominal variables exist.

Recently,  nonparametric  classification  techniques  have  been  considered  as 
alternatives to traditional parametric methods in classification problems. For example,
Marais, Patell, and Wolfson (1984) applied a recursive partitioning algorithm (RPA) to
commercial loan classification and found it to be "a viable competitor to parametric
methods such as polytomous Probit even when the assumptions underlying the
parametric model are satisfied." Frydman, Altman, and Kao (1985) also report that
RPA  outperforms  discriminant  analysis  in  most  original  sample  and  hold-out 
comparisons.  In  addition  to  the  feature  of  making  no  assumption  on  data  distributions, 
non-parametric  methods  usually  derive  a  decision  tree  that  shows  the  interaction  of 
variables.  After  proper  transformation,  decision  rules  suitable  for  developing  expert 
systems or rule-based decision support systems can be derived from the decision tree,
which may make the resulting model easier to use and to understand.

[1] Counter-arguments also exist. In a recent study, for example, Noreen (1988)
shows that (1) the rejection regions for the Probit test statistics are not well-specified
for small samples, and (2) ordinary least squares regression seems to perform at least
as well as Probit for the cases considered.

The primary purpose of this paper is to investigate the properties of another
nonparametric algorithm, the ID3 method, in analyzing accounting problems. The ID3
algorithm is an inductive learning technique that derives decision models from data. It
originated from Hunt, Martin, and Stone's work (1966) on conceptual learning and was
later implemented and expanded by Quinlan (1979, 1982). The primary difference
between ID3 and RPA is that the former uses a criterion derived from information theory
to determine the relative importance of independent variables and constructs decision
trees accordingly, whereas the latter minimizes the observed expected cost of
misclassification. Recent studies on ID3 have provided evidence that it can outperform
expert judgment and discriminant analysis (e.g., Braun and Chandler 1987; Messier and
Hansen 1988). In this paper, we use both original and hold-out samples to investigate
its sensitivity to training sample size and the nature of the data set. The particular
accounting problem studied is the LIFO/FIFO decision.

Our  empirical  results  include  the  following.  First,  ID3  and  Probit  identify 
different  factors  that  affect  LIFO/FIFO  choice.  This  raises  a  concern  about  the  effect 
of  research  methods  on  the  interpretation  of  research  findings.  Second,  in  hold-out 
tests,  ID3  performs  better  when  the  sample  size  of  the  input  data  set  is  small  relative  to 
the  total  population;  whereas  Probit  performs  better  when  the  sample  size  is  relatively 
large.  Third,  ID3  performs  better  when  the  input  data  set  is  dominated  by  nominal 
variables;  whereas  Probit  performs  better  otherwise. 

The  remainder  of  this  article  is  organized  as  follows.  Section  2  describes  the  ID3 
algorithm.  Section  3  briefly  reviews  some  methodological  issues  in  LIFO/FIFO 
research.  Section  4  discusses  the  first  experiment  that  compares  the  internal  validity  of 
the  models  (i.e.,  the  degree  to  which  the  cases  in  the  data  set  from  which  the  model  was 
derived are correctly classified by a model). Section 5 presents the results of the second
experiment in which hold-out samples are used to examine the external validity of the
resulting models (i.e., the degree to which hold-out cases are correctly classified by a
model). Section 6 summarizes the findings and discusses some implications.

2.  THE  ID3  ALGORITHM 

The  input  to  ID3  is  a  data  set  consisting  of  observed  data  of  N  cases  (called 
training  sample  data).  For  each  case,  the  input  data  include  its  actual  group 
classification  and  values  associated  with  a  finite  number  of  factors  potentially  affecting 
its  group  classification.  The  function  of  the  algorithm  is  to  induce  a  model  from  the 
observed  data,  which  is  capable  of  identifying  the  relationships  between  the  factors  and 
the  actual  classification.  Instead  of  relying  on  sample  distribution  statistics,  the 
algorithm  uses  entropy  to  measure  the  relative  information  content  attributed  to  each 
factor  and  generates  a  decision  tree  model.  The  factor  with  the  highest  information 
content is considered the most important factor and is selected as the root node of the
tree.  Other  factors  are  then  examined  based  on  their  relative  information  content.  In 
this  section,  we  shall  discuss  the  measurement  of  information  content  and  the  model 
construction  process  of  the  ID3  algorithm. 
2.1. A Measurement of Information Content: Entropy

Entropy  was  originally  developed  to  measure  the  amount  of  information 
transmitted  in  a  communication  process  (Shannon  and  Weaver  1949).  It  indicates  the 
observational variety and has a value range from zero to one (Krippendorff 1986).
Entropy is zero when all observations are of the same kind (i.e., no variety), and is one
when observations have equal opportunities to be classified as any one of the classes
(i.e., maximum degree of variety). Entropy assumes nothing about the nature of the
frequency or probability distribution and is, thus, nonparametric. When applying
entropy  to  classification  problems,  the  entropy  of  a  variable  shows  the  extent  to  which 
the accuracy of a classification can be improved (or the uncertainty can be reduced) by
introducing the variable. The purpose of the ID3 algorithm is to construct a decision
tree capable of classifying all cases in the input data set.

Mathematically,  entropy  is  a  logarithmic  function  of  related  frequencies  or 
probabilities.  Consider  a  data  set  of  N  cases,  with  each  case  described  by  a  number  of 
variables and a category. A given variable X classifies the N cases in k categories,
$C_1, \ldots, C_k$, and has m values, $V_1, \ldots, V_m$. For a particular value, $V_j$, of X, there is a
probability of $p_{ij}$ that $V_j$ classifies a case into class $C_i$. The entropy of $X = V_j$ is

$$H(V_j) = -\sum_{i=1}^{k} p_{ij} \log_2 p_{ij} \qquad (1)$$

The entropy of X, the weighted sum over all of its m values, is

$$H(X) = \sum_{j=1}^{m} \frac{N_j}{N} H(V_j) \qquad (2)$$

where $N_j$ = number of cases where $X = V_j$.

For  numerical  variables,  the  calculation  of  entropy  by  ID3  includes  two  steps. 
First,  a  value  is  chosen  to  split  the  range  of  values  for  that  variable  into  two  regions: 
high  and  low.  Second,  the  entropy  of  the  variable  is  computed  based  on  that  split 
value.  This  process  is  performed  for  each  possible  split.  The  value  that  minimizes 
entropy is selected as the split value for the variable. In other words, for each case $W_i$
($1 \le i < N$), ID3 divides the X values, taken in ascending order, into two subsets
$(V_1, \ldots, V_i)$ and $(V_{i+1}, \ldots, V_N)$, which allows ID3 to compute the entropy resulting
from the split. If the division into $(V_1, \ldots, V_t)$ and $(V_{t+1}, \ldots, V_N)$ has the lowest
entropy, then we split the variable at S, where

$$S = (V_t + V_{t+1})/2 \qquad (3)$$
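
The split search described above can be sketched in a few lines of Python. This is a minimal illustration under our own assumptions, not the implementation used in the experiments: the function names `entropy` and `best_split` are hypothetical, tied values are skipped because a threshold cannot separate them, and the net sales figures are invented rather than taken from Table 1.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Equation (1): entropy of the class labels within one group."""
    n = len(labels)
    return sum(-(c / n) * log2(c / n) for c in Counter(labels).values())


def best_split(values, labels):
    """Return (split value S, weighted entropy) of the best binary split."""
    cases = sorted(zip(values, labels))
    n = len(cases)
    best = None
    for t in range(1, n):
        low_value, high_value = cases[t - 1][0], cases[t][0]
        if low_value == high_value:
            continue  # tied values cannot be separated by a threshold
        low = [label for _, label in cases[:t]]
        high = [label for _, label in cases[t:]]
        # Equation (2) applied to the two regions "low" and "high".
        h = (len(low) / n) * entropy(low) + (len(high) / n) * entropy(high)
        if best is None or h < best[1]:
            # Equation (3): midpoint between the two neighbouring values.
            best = ((low_value + high_value) / 2, h)
    return best


# Hypothetical net sales figures (not the data in Table 1).
sales = [63, 120, 400, 500, 900, 2300]
method = ["FIFO", "FIFO", "FIFO", "LIFO", "LIFO", "LIFO"]
split, h = best_split(sales, method)
print(split, h)
```

With these invented figures the two classes separate cleanly between 400 and 500, so the search returns the midpoint of those two values with zero entropy.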


Insert Table 1 Here

Table 1, for example, shows a set of highly simplified LIFO/FIFO data including
one nominal variable and one integer variable. For the variable of industry type, in the
lumber industry, six firms use FIFO and no firm uses LIFO, while in the metal industry
one firm uses FIFO and seven firms use LIFO. Based on equations (1) and (2), the
entropy of the variable industry type can be calculated as follows:

$$H(\text{Industry} = \text{lumber}) = -\tfrac{6}{6}\log_2\tfrac{6}{6} - \tfrac{0}{6}\log_2\tfrac{0}{6} = 0$$

$$H(\text{Industry} = \text{metal}) = -\tfrac{1}{8}\log_2\tfrac{1}{8} - \tfrac{7}{8}\log_2\tfrac{7}{8} = 0.54$$

$$H(\text{Industry}) = \tfrac{6}{14} \times 0 + \tfrac{8}{14} \times 0.54 = 0.308$$

Since  net  sales  is  an  integer  variable,  we  need  to  find  the  split  with  the  minimum 
entropy.  Among  the  thirteen  possible  splits  in  the  example,  the  optimum  split  is  450 
million  which  divides  the  values  of  net  sales  into  two  groups:  (63,  ..  ,  400)  and  (500,  ..  , 
2300). The first group includes five FIFO firms and no LIFO firm; whereas the second
group includes two FIFO firms and seven LIFO firms. Its entropy is

$$H(\text{Net sales}) = \tfrac{5}{14} \times 0 - \tfrac{9}{14}\left(\tfrac{2}{9}\log_2\tfrac{2}{9} + \tfrac{7}{9}\log_2\tfrac{7}{9}\right) = 0.491$$

The  values  of  0.308  and  0.491  indicate  the  resulting  varieties  after  introducing 
industry  type  and  net  sales  into  the  classification  model,  respectively. 
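
The arithmetic of this example can be checked directly from the class counts given in the text. The sketch below is only a verification aid; the function names are ours, and the term 0 log2 0 is taken to be zero, as is conventional. Without intermediate rounding, the entropy of industry type comes out at about 0.311; the 0.308 reported above follows from multiplying the rounded value 0.54 by 8/14. The ranking of the two variables is unaffected.

```python
from math import log2


def entropy_from_counts(counts):
    """Equation (1) on class counts, with 0 * log2(0) taken as 0."""
    n = sum(counts)
    return sum(-(c / n) * log2(c / n) for c in counts if c > 0)


def weighted_entropy(groups):
    """Equation (2): groups holds one list of class counts per value of X."""
    total = sum(sum(g) for g in groups)
    return sum(sum(g) / total * entropy_from_counts(g) for g in groups)


# Class counts as (FIFO, LIFO), read from the text.
lumber, metal = [6, 0], [1, 7]
low_sales, high_sales = [5, 0], [2, 7]  # net sales below / above 450

h_metal = entropy_from_counts(metal)
h_industry = weighted_entropy([lumber, metal])
h_net_sales = weighted_entropy([low_sales, high_sales])
print(round(h_metal, 3), round(h_industry, 3), round(h_net_sales, 3))
```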
2.2.  Model  Construction  Process 

Since lower entropy implies a lower level of variety and lower uncertainty, the ID3
algorithm  considers  the  variable  with  the  lowest  uncertainty  as  the  most  important  one 
and  gives  it  higher  priority  in  constructing  models.  Its  model  construction  process 
begins  with  the  whole  input  data  set  from  which  the  root  node  of  the  classification  tree 
can  be  constructed.  This  includes  several  steps.  First,  the  entropy  of  each  variable  is 
calculated  based  on  the  input  data.  Second,  the  variable  with  the  minimum  entropy  is 
chosen as the root node of the tree. If the variable is nominal and has m levels, then
the tree will have m branches at the first level and all input cases will be divided into m
groups according to their values of the root variable. For numerical variables, the tree
will have two branches containing cases whose values are higher than and lower than
the split value, respectively.

After splitting the original cases, each of the m groups of input cases is
considered a separate data set. If all cases in the group are in the same category, then
no further analysis on the group is needed. This indicates that the preceding variable
is capable of classifying those cases completely. The category to which these cases
belong becomes a leaf node of the tree. Otherwise, entropies of the variables will be
calculated again based on the cases in the subgroup and the variable with the minimum
entropy will be attached to the branch. The cases in the group will be further split
based on the value of the selected variable. This process continues until no further
improvement is possible.
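
The recursive construction process can be sketched as follows. This is a simplified illustration rather than Quinlan's implementation: it handles nominal variables only (numerical variables would first need the binary split of Section 2.1), it stops when a group is pure or no variables remain, and it labels an impure leaf with the majority class. The function names and the six example cases are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Equation (1): entropy of the class labels within one group."""
    n = len(labels)
    return sum(-(c / n) * log2(c / n) for c in Counter(labels).values())


def variable_entropy(cases, var):
    """Equation (2): weighted entropy after partitioning the cases on var."""
    groups = {}
    for attrs, label in cases:
        groups.setdefault(attrs[var], []).append(label)
    n = len(cases)
    return sum(len(g) / n * entropy(g) for g in groups.values())


def build_tree(cases, variables):
    """cases: list of (attribute dict, class label); variables: unused names."""
    labels = [label for _, label in cases]
    if len(set(labels)) == 1 or not variables:
        # Leaf node: one class only, or nothing left to split on (assumption:
        # fall back to the majority class).
        return Counter(labels).most_common(1)[0][0]
    best = min(variables, key=lambda v: variable_entropy(cases, v))
    remaining = [v for v in variables if v != best]
    branches = {}
    for attrs, label in cases:
        branches.setdefault(attrs[best], []).append((attrs, label))
    return {best: {value: build_tree(subset, remaining)
                   for value, subset in branches.items()}}


# Hypothetical cases with two nominal variables.
cases = [
    ({"industry": "lumber", "size": "small"}, "FIFO"),
    ({"industry": "lumber", "size": "large"}, "FIFO"),
    ({"industry": "lumber", "size": "large"}, "FIFO"),
    ({"industry": "metal", "size": "small"}, "FIFO"),
    ({"industry": "metal", "size": "large"}, "LIFO"),
    ({"industry": "metal", "size": "large"}, "LIFO"),
]
tree = build_tree(cases, ["industry", "size"])
print(tree)
```

In this toy data set industry has the lower entropy and becomes the root; the lumber branch is already pure, while the metal branch is split once more on size.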

In  the  previous  example,  the  entropy  of  industry  type  is  lower  than  that  of  net 
sales.  Therefore,  the  industry  type  forms  the  root  node,  which  divides  the  firms  into 
lumber  and  metal  groups.  Since  all  firms  in  the  lumber  group  use  FIFO,  no  further 
analysis  is  possible  and  the  leaf  node  of  this  branch  is  FIFO. 

In  the  metal  group,  one  firm  uses  FIFO  and  seven  firms  use  LIFO.  A  further 
analysis  splits  the  net  sales  of  the  firms  into  two  groups:  (500,  ..  ,  1000)  and  (1420,  ..  , 
2300).  The  first  group  includes  one  FIFO  and  two  LIFO  firms,  while  the  second  one 
includes  five  LIFO  firms  and  no  FIFO  firm.  The  split  value  is  1210  and  the  entropy  is 
calculated to be 0.344. Since the firms in the first group are not all of the same class,
we can further classify them into two categories: firms with net sales less than 825
(LIFO firms) and firms with net sales higher than 825 (FIFO firms). All firms in the
second  group  are  LIFO  firms  and  cannot  be  further  decomposed.  The  process  stops 
when  all  firms  in  the  same  group  are  using  the  same  inventory  method.  Figure  1  shows 
the  resulting  decision  tree. 
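
The metal-group entropy and the decision rules described in this example can be written out as a short sketch. The rules are read off the text above, not from Figure 1 itself; the function name `classify`, the treatment of values exactly equal to a split value, and the error raised for industries outside the example are our assumptions.

```python
from math import log2

# Metal group split at 1210: three firms (one FIFO, two LIFO) fall below the
# split and five LIFO firms fall above it.
h_first_group = -(1 / 3) * log2(1 / 3) - (2 / 3) * log2(2 / 3)
h_split_1210 = (3 / 8) * h_first_group + (5 / 8) * 0.0
print(round(h_split_1210, 3))


def classify(industry, net_sales):
    """Decision rules of the worked example; net_sales in millions."""
    if industry == "lumber":
        return "FIFO"
    if industry != "metal":
        raise ValueError("the example covers only lumber and metal")
    if net_sales > 1210:
        return "LIFO"
    # Assumption: a value exactly at a split falls in the upper branch.
    return "LIFO" if net_sales < 825 else "FIFO"


print(classify("metal", 1000))
```

Written this way, each path from the root to a leaf is an if-then rule of the kind used in rule-based systems.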
Insert Figure 1 Here


2.3  A  Comparison  of  ID3  and  Probit 

The Probit method uses statistical inference procedures to derive a linear model from
a set of input data. The model estimates the likelihood that, given the input data, the
case falls in a particular class. It makes several assumptions. First, the dependent
variable is categorical. Second, the relationship between the dependent variable and the
independent variables is a cumulative normal distribution. Third, no two or more
independent variables are perfectly correlated. Fourth, there is no serial correlation of
the dependent variable among the cases. Based on these assumptions, Probit estimates
the parameters of the linear model by Maximum Likelihood Estimation (MLE)
procedures (see, for example, [Aldrich and Nelson 1985] for a detailed discussion).
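
For a two-class problem, the Probit model described above can be sketched as a cumulative normal function applied to a linear index, together with the log-likelihood that MLE maximizes. The code below only evaluates these quantities for given parameters; it does not perform the maximization. The function names, coefficients, and data are hypothetical.

```python
from math import erf, log, sqrt


def normal_cdf(z):
    """Standard cumulative normal distribution function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def probit_probability(x, beta, intercept):
    """P(class = 1 | x) under a Probit model with a linear index."""
    index = intercept + sum(b * xi for b, xi in zip(beta, x))
    return normal_cdf(index)


def log_likelihood(X, y, beta, intercept):
    """Log-likelihood of the observed 0/1 classes; MLE would maximize this."""
    total = 0.0
    for x, yi in zip(X, y):
        p = probit_probability(x, beta, intercept)
        total += log(p) if yi == 1 else log(1.0 - p)
    return total


# Hypothetical data: one explanatory variable, class 1 = LIFO.
X = [[-1.0], [0.5], [2.0]]
y = [0, 1, 1]
print(log_likelihood(X, y, beta=[1.0], intercept=0.0))
print(log_likelihood(X, y, beta=[-1.0], intercept=0.0))
```

The positive coefficient fits these three cases better than the negative one, which is reflected in its higher log-likelihood.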

ID3  is  different  from  Probit  in  at  least  the  following  aspects.  First,  the  ID3 
algorithm  makes  no  assumption  on  data  distribution.  In  fact,  the  algorithm  treats 
continuous  variables  as  discrete  and  uses  a  recursive  decomposition  process  to  divide 
their  values  into  several  discrete  ranges.  Probit,  on  the  other  hand,  assumes  that  the 
relationship  between  the  dependent  and  independent  variables  is  a  cumulative  normal 
distribution  function.  Therefore,  it  seems  that  the  ID3  algorithm  is  more  appropriate 
when  the  normality  assumption  is  likely  to  be  violated,  and  Probit  is  more  appropriate 
otherwise. 

Second, the ID3 algorithm generates decision tree models in which the weakness
of one factor may not be compensated by the strength of another. Probit models,
however,  assume  a  linear  compensatory  relationship  among  independent  variables.  This 
implies  that  ID3  may  be  more  appropriate  when  the  problem  involves  nominal  variables 
that make a linear model inappropriate.

Third, the model construction process of ID3 is essentially an exhaustive
decomposition process, which tries to cover every instance; whereas the Probit method
focuses on optimizing the probability of correct classification. Therefore, ID3 seems to
be more likely to overfit the sample data and hence may be more sensitive to the noise
in the input data set.

Finally, the entropy function is a logarithmic function generally biased toward
variables with more levels and against variables with fewer levels (Mingers 1987). In
other words, variables with more levels are more likely to be given higher priority in the
model construction process. Probit models do not have this bias in processing
numerical variables, but may favor attributes with fewer levels when dummy
variables are used in handling nominal variables.
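
An extreme case makes the bias toward many-level variables concrete. In the hypothetical sketch below, a variable that takes a different value for every case (such as a firm identifier) yields a weighted entropy of zero by equation (2), even though it carries no information about unseen firms, whereas a two-level variable unrelated to the class does not. The data and names are invented for illustration only.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Equation (1): entropy of the class labels within one group."""
    n = len(labels)
    return sum(-(c / n) * log2(c / n) for c in Counter(labels).values())


def variable_entropy(values, labels):
    """Equation (2): weighted entropy after partitioning on a variable."""
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    n = len(labels)
    return sum(len(g) / n * entropy(g) for g in groups.values())


labels = ["FIFO", "LIFO", "FIFO", "LIFO", "FIFO", "LIFO"]
firm_id = [101, 102, 103, 104, 105, 106]    # one level per case
two_level = ["a", "a", "a", "b", "b", "b"]  # two levels, weak relation

h_id = variable_entropy(firm_id, labels)
h_two = variable_entropy(two_level, labels)
print(h_id, round(h_two, 3))
```

Under the minimum-entropy rule the identifier would be chosen as the root node, which is the behavior the bias describes.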

Given  these  differences,  it  would  be  interesting  to  know  whether  these  two 
methods  have  different  properties  when  they  are  applied  to  accounting  classification 
problems. How different will the models derived from different methods be? Do different
models  have  different  internal  and  external  validities?  Which  method  is  better?  When 
and  why  does  a  particular  method  outperform  the  other?  In  the  remaining  sections,  we 
describe  two  experiments  investigating  these  issues  in  the  context  of  LIFO/FIFO 
choices. 

3.  BACKGROUND  OF  LIFO/FIFO  RESEARCH 

Choice  of  inventory  accounting  methods  has  been  a  research  issue  for  the  past 
decade.  Theoretically,  the  LIFO  method  has  tax  advantages  when  inflation  exists  and 
is  considered  more  attractive  than  the  FIFO  method.  In  practice,  however,  a  majority 
of  firms  still  adopt  FIFO  as  their  primary  inventory  accounting  method.  As  a  result, 
much  research  has  been  conducted  to  investigate  the  factors  affecting  the  adoption  of  a 
certain method (e.g., Biddle [1980], Cushing and LeClere [1988], Dopuch and Pincus
[1988], Lee and Hsieh [1985], Morse and Richardson [1983]).

Previous literature has examined at least three potential explanations of
LIFO/FIFO choice: Ricardian costs, agency costs, and political costs (Lee and Hsieh
1985).  The  Ricardian  hypothesis  assumes  that  the  inventory  method  choice  is  based  on 
a  firm's  comparative  advantage  in  tax  minimization  associated  with  the  production- 
investment  opportunity  set.  A  particular  method  (e.g.,  LIFO)  will  be  adopted  if  its  tax 
savings  exceed  the  implementation  costs.  Therefore,  LIFO  may  be  the  optimal  choice 
for  some  firms;  whereas  FIFO  is  the  optimal  choice  for  others.  The  agency  cost 
hypothesis  assumes  that  some  firms  remain  on  FIFO  to  report  higher  earnings  because 
of  managers'  concerns  about  the  impact  of  a  LIFO  switch  on  the  securities  market  or 
their  compensation  contracts  (e.g.,  Abdel-Khalik  1985;  Ricks  1982).  Managers  are 
willing  to  forego  potential  tax  savings  to  obtain  other  benefits.  The  political  costs 
hypothesis  assumes  that  a  method  will  be  chosen  if  its  political  costs  exceed  the 
potential  tax  savings.  For  example,  the  dominating  firm  in  an  industry  may  choose 
LIFO  to  reduce  its  reported  earnings  to  avoid  being  the  target  of  the  anti-trust  laws. 

Probit  has  been  the  major  method  used  in  previous  studies  to  test  these 
hypotheses.  Empirical  findings,  however,  are  inconclusive  in  many  aspects.  For 
example,  the  relative  frequency  of  price  increases  was  found  significantly  different 
between  LIFO  and  FIFO  firms  by  Lee  and  Hsieh  [1985];  but  the  effect  was  insignificant 
in Dopuch and Pincus [1988]. The inconsistency in previous research findings may be
due  to  several  reasons. 

(1) Data effects -- the data collected for hypothesis testing in different studies
may have different characteristics. In terms of long-term LIFO and FIFO firms, for
example, Lee and Hsieh (1985) chose firms using a certain method consecutively for
more than seven years; whereas Dopuch and Pincus (1988) used 20 years as the
criterion.

(2) Variable effects -- variables selected for examination may be different in
different studies. There is usually more than one variable that can be used to test a
theory. For example, both net sales and total assets can be used as surrogate variables
for firm size. In addition, the correlation between variables may make it difficult to
clearly relate the significance of a variable to a single theory.

(3) Method effects -- the Probit method used for hypothesis testing may have
limitations that prevent it from providing unbiased results. In general, there are three
potential  biases  in  Probit  models.  First,  the  effect  of  nominal  variables  may  be 
underestimated.  Using  dummy  variables  to  handle  nominal  factors  dilutes  the  overall 
effect  of  the  factor.  This  bias  is  particularly  significant  in  a  multivariate  analysis  when 
the  nominal  variable  has  many  levels.  In  LIFO/FIFO  studies,  for  example,  industry 
type  was  found  significant  in  univariate  analysis  but  insignificant  in  multivariate 
analysis  (Lee  and  Hsieh  1985). 

Second,  the  linear  compensatory  model  assumption  may  be  inappropriate  in 
studying LIFO/FIFO decisions. The linear compensatory model is appropriate only if
we assume that the manager uses a weighted-sum strategy to make the LIFO/FIFO
decision. Otherwise, we need to consider other functional forms. The decision tree
model  derived  from  ID3  may  be  an  appropriate  alternative  form  for  other  strategies 
such  as  conjunctive  selection,  disjunctive  selection,  or  elimination  by  aspects. 

Third,  the  cumulative  normal  assumption  may  be  violated,  which  results  in 
unreliable parameter estimations. There are at least two factors that may cause the
violation  of  the  normality  assumption:  nature  of  data  and  training  sample  size.  When 
the  decision  is  primarily  affected  by  nominal  variables  or  the  data  distribution  is 
extremely  skewed,  the  normality  assumption  is  likely  to  be  violated.  When  the  size  of 
the  input  data  is  small,  the  normality  assumption  is  also  likely  to  be  violated.  Based  on 
the discussion in the previous section, the ID3 algorithm does not have these biases
(although it certainly may have some other biases) and can be a promising alternative
to Probit in investigating the inventory accounting decision.

In  order  to  compare  the  ID3  and  Probit  methods,  we  conducted  two 
experiments.  In  the  first  study,  we  examined  the  data  and  variable  effects  of  Probit 
models,  and  compared  the  internal  validity  of  the  ID3  and  Probit  models.  In  the 
second  experiment,  we  used  hold-out  samples  to  examine  how  training  sample  size  and 
the  nature  of  data  affected  the  external  validity  of  these  methods. 
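
The distinction between internal and external validity can be sketched as the accuracy of one model on the cases it was derived from versus its accuracy on hold-out cases. The sketch below is a generic illustration and not the sampling design of the experiments; the function names, the random split, the fixed rule standing in for a fitted model, and the data are all hypothetical.

```python
import random


def accuracy(model, cases):
    """Share of cases whose predicted class equals the actual class."""
    return sum(model(x) == y for x, y in cases) / len(cases)


def split_holdout(cases, n_train, seed=0):
    """Shuffle the cases and return (training sample, hold-out sample)."""
    shuffled = list(cases)
    random.Random(seed).shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]


def rule(x):
    """A fixed, deliberately imperfect rule standing in for a fitted model."""
    return "LIFO" if x > 4 else "FIFO"


# Hypothetical cases: one numerical variable and the actual class.
cases = [(x, "LIFO" if x > 5 else "FIFO") for x in range(10)]
train, holdout = split_holdout(cases, n_train=7)

internal = accuracy(rule, train)    # internal validity: training cases
external = accuracy(rule, holdout)  # external validity: hold-out cases
print(internal, external)
```

In an actual comparison the model would be induced from the training sample before both accuracies are measured.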

4.  THE  FIRST  EXPERIMENT 

The  first  experiment  focuses  on  comparing  the  models  resulting  from  ID3  and 
Probit.  Data  collected  from  the  COMPUSTAT  data  base  and  DRI  tape  are  analyzed 
by  both  Probit  and  ID3  methods.  The  results  are  then  compared  with  previous 
empirical  findings.  Our  primary  purpose  is  to  examine  the  methodological  issues  such 
as  the  variable  and  method  effects.  Therefore,  we  have  no  intention  of  arguing  whether 
previous  LIFO/FIFO  research  findings  are  appropriate. 
4.1  Data  Collection 

Data  collection  included  two  stages.     An  initial  data  base  consisting  of  eighteen 
variables  was  constructed.    This  data  base  was  then  used  to  compile  six  data  sets  for  the 
experiments. 
4.1.1  Initial  data  base 

Based on theories and previous research findings, eighteen explanatory variables
considered to have an effect on LIFO/FIFO choices were selected: one
nominal and seventeen numerical variables. This set of variables was chosen to reflect
the  following  concerns. 

(1) Nature of industry -- Some industries have unique environments that favor a
certain inventory accounting method. Most previous research uses two-digit SIC codes
to represent the nature of industry. This variable was found significant in
Eggleton, Penman, and Twombly (1976) and in the univariate analysis of Lee and Hsieh
(1985).

(2) Firm size -- The benefits of using LIFO are expected to be more significant
for larger firms. Firm size was found important in Morse and Richardson (1983), Abdel-Khalik
(1985), Cushing and LeClere (1988), and Dopuch and Pincus (1988). Three variables
were used as surrogates for firm size in our study: net sales, net income, and total
assets.

(3) Inflation and its variability -- Higher and stable inflation rates are expected to
generate higher tax benefits from using LIFO. We used the average growth of input
price to measure inflation rate, and used the coefficient of variation (CV) of input price[2]
and the CV of growth of input price to measure price variability.

(4)  Inventory  and  its  variability  --  A  stable  and  non-decreasing  inventory  level  is 
expected  to  generate  the  maximum  tax  savings  from  LIFO  adoption.  We  used  average 
inventory  to  measure  the  inventory  level  and  CV  of  inventory  to  measure  inventory 
variability.  In  general,  inventory  variability  may  be  affected  by  the  variabilities  of 
demand  and  production.  Firms  with  lower  demand  or  production  variability  more 
easily  maintain  a  stable  inventory  level.  We  used  net  sales  growth,  CV  of  net  sales 
growth,  and  relative  frequency  of  net  sales  growth  to  measure  demand  variability;  and 
used  CV  of  net  income  and  CV  of  net  sales  to  measure  the  operational  variability  of  a 
firm. 

(5) Inventory controllability -- Tax savings obtained from using LIFO depend
on the inventory controllability of a firm. The ability to control inventory is a
favorable factor for a firm to adopt LIFO. We used two ratios, inventory/net sales and
inventory/total assets, to measure the inventory controllability.

[2] Coefficient of Variation (CV) = Standard deviation / mean.

(6)  Capital  intensity  --  Lee  and  Hsieh  (1985)  argue  that  capital-intensive  firms 
have  higher  fixed-to-variable-cost  ratios  and  should  have  a  stronger  incentive  to  use 
LIFO.    We  included  gross  capital  intensity  in  our  variable  set. 

(7) Debt/equity ratio -- A higher debt/equity ratio may force the firm's manager
to  increase  current  earnings  by  adopting  LIFO.  We  included  long-term  debt/equity 
ratio  as  its  surrogate  measure. 
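
Several of the variability measures above rely on the coefficient of variation, defined as the standard deviation divided by the mean. A minimal sketch follows; the function name is ours, the yearly figures are invented, and the choice between the sample and the population standard deviation is exposed as a parameter because the definition does not say which one applies.

```python
from statistics import mean, pstdev, stdev


def coefficient_of_variation(values, sample=True):
    """CV = standard deviation / mean.

    sample=True uses the sample standard deviation (n - 1 denominator);
    sample=False uses the population standard deviation. A zero mean
    raises ZeroDivisionError.
    """
    sd = stdev(values) if sample else pstdev(values)
    return sd / mean(values)


# Hypothetical yearly inventory levels for one firm.
inventory = [2, 4, 4, 4, 5, 5, 7, 9]
print(coefficient_of_variation(inventory, sample=False))
print(coefficient_of_variation(inventory))
```

Because the CV is scale-free, it allows the variability of firms of very different sizes to be compared.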

Table 2 lists the seventeen numerical variables and indicates those tested in Lee
and Hsieh (1985) and Dopuch and Pincus (1988). Please note that our point is not to
determine  the  best  set  of  explanatory  variables  for  LIFO/FIFO  studies  but  to  develop  a 
set  of  LIFO/FIFO  data  on  which  the  impact  of  different  methodologies  can  be 
investigated. 


Insert  Table  2  Here 


After  determining  the  variables,  data  were  collected  from  the  COMPUSTAT 
database.  The  inflation  data  were  collected  from  the  DRI  tape.  Since  many  firms 
switched  from  FIFO  to  LIFO  in  1974  in  response  to  the  oil  crisis,  we  set  1975  as  the 
starting year to obtain samples. The criterion for selection was that the firms must
have used LIFO or FIFO consecutively for at least ten years. Initially, 220 FIFO
firms  and  60  LIFO  firms  were  identified.  Three  of  them  were  later  eliminated  because 
of  missing  data.  These  firms  were  distributed  in  23  industries,  as  listed  in  Table  3. 
Table  4  shows  the  means  and  standard  deviations  of  LIFO  and  FIFO  firms  for  the 
numerical  variables. 


Insert Tables 3 and 4 Here


4.1.2 Testing data sets

Since  more  than  one  surrogate  variable  may  reflect  the  same  theoretical  factor  in 
our  initial  data  base,  high  correlations  among  them  exist.  In  order  to  test  the  variable 
and  method  effects  in  classification  research,  we  compiled  six  data  sets  of  different 
variables from the initial data set. Each resulting data set still has 280 cases. The
procedures  for  composing  these  data  sets  are  as  follows.  First,  three  sets  of  data  with 
eight  numerical  variables  each  were  selected  after  considering  the  multicollinearity 
issue.  This  allowed  us  to  examine  the  effect  of  using  different  surrogate  variables  in 
model  construction.  Second,  the  nominal  variable  industry  type  was  added  to  the  three 
sets  to  form  another  three  data  sets.  This  allowed  us  to  examine  the  effect  of  nominal 
variables  in  model  construction.    Table  5  lists  the  variables  included  in  each  data  set. 


Insert  Table  5  Here 


4.2  Data  Analysis 

For  each  data  set,  two  analyses  were  applied  to  construct  models  from  data. 
First,  Probit  was  applied  to  examine  the  effect  of  including  different  variables  on 
hypothesis  testing.  The  results,  as  shown  in  Tables  6  and  7,  indicate  that  the  variable 
effect  does  exist  when  Probit  is  applied.  For  example,  long-term  debt/equity  is 
significant  in  model  1  but  insignificant  in  models  2  and  3.  In  addition,  when  CV  of  net 
sales was replaced by CV of net sales growth, the significance level of CV of inventory
was reduced (models 1 and 3 in Table 6).


Insert  Tables  6  and  7  Here 


Another effect we observed is the impact of nominal variables. Comparing the
models in Tables 6 and 7, we find that three variables become significant when a
nominal variable (industry type) is included: net sales, long-term debt/equity, and
growth of input price. However, the significance of gross capital intensity decreases
(models 1 and 3 in Tables 6 and 7). None of the dummy variables for the different
industries was statistically significant. In summary, the results in Tables 6 and 7
suggest that the addition or deletion of a variable may change the significance levels
of other variables and hence affect the reliability of hypothesis testing.

In the second analysis, the ID3 method was applied to the data sets that included
industry type (see footnote 3). The resulting decision-tree models are shown in
Figures 2, 3, and 4. Assuming the variables included in the decision rules to be
significant factors, we find three differences between the Probit and ID3 models.
First, the two methods selected different factors. For example, industry type was
the most significant factor in the ID3 models but was insignificant in the Probit
models. Inventory/net sales was very significant in the Probit model (model 1 in
Table 7), but appeared only in industries 2600, 3600, and 3700 in the ID3 models.
Second, ID3 identified different factors for different industries. For instance,
long-term debt/equity was found important in printing, publishing, and allied
industries (SIC code 2700), but irrelevant in the lumber (2400) or chemical (2800)
industries. This implies that ID3 is capable of identifying the industry-specific
nature of inventory accounting choices. Third, the ID3 models are relatively less
sensitive to the addition or deletion of variables. A large portion of the decision
trees remains the same in Figures 2, 3, and 4.
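
The prominence of industry type in the ID3 trees can be illustrated on the small
sample of Table 1. The sketch below assumes the standard entropy-based splitting
criterion of ID3 (information gain) and uses only the industry type and accounting
method columns of Table 1; the function names are illustrative and are not taken
from the ACLS software.

```python
import math
from collections import Counter

# Rows of Table 1 as (industry type, accounting method); net sales is omitted.
SAMPLE = [("Lumber", "FIFO")] * 6 + [("Metal", "FIFO")] + [("Metal", "LIFO")] * 7


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(rows):
    """Entropy reduction obtained by splitting the rows on their first field."""
    labels = [label for _, label in rows]
    groups = {}
    for value, label in rows:
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


gain = information_gain(SAMPLE)
print(f"Information gain of industry type: {gain:.3f} bits")
```

On these fourteen cases the split on industry type removes roughly 0.69 of the 1.0
bit of uncertainty about the accounting method, which is why a tree-building
procedure of this kind would tend to place a dominant nominal variable near the root.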


Insert  Figures  2,  3,  and  4  Here 


In addition to the differences in model format and in the variables included in a
model, the classification accuracy of the resulting models is also important. Table 8
shows a comparison of the classification accuracy between the models constructed by
Probit and ID3. Here the classification accuracy is measured by the percentage of the
cases in the input data set that are correctly classified by the model. Generally
speaking, Probit models with industry type outperformed those without industry type,
and ID3 models outperformed Probit models in terms of the percentage of firms
correctly classified. Since the ID3 algorithm tries to cover all sample data in the
process of model construction, the perfect classification is no surprise. In fact,
this level is usually achieved unless conflicting data exist in the samples. A
potential problem associated with the high classification accuracy of ID3 is that it
may overfit the input data and hence may be heavily influenced by the noise present
in the input data set. Therefore, it is necessary to conduct another experiment to
compare the prediction accuracy, i.e., the accuracy when the models are applied to
hold-out samples, and the circumstances in which a particular method is more
appropriate.


3 The software used to run the ID3 algorithm is called ACLS, which stands for
Analog Concept Learning System.


Insert  Table  8  Here 


5.  THE  SECOND  EXPERIMENT 


The  second  experiment  uses  hold-out  samples  to  compare  the  external  validity  of 
Probit  and  ID3.  In  order  to  examine  the  situations  where  a  particular  method  is  better, 
two  factors  that  may  affect  the  applicability  of  a  particular  method  were  investigated: 
nature  of  the  data  set  and  training  sample  size.  The  experimental  design  included  three 
independent  variables:  data  analysis  method  (METHOD),  characteristics  of  the  data  set 
(DATA),  and  training  sample  size  (SIZE).  They  were  organized  into  a  2 x 2 x 3  factorial
design. 

The methods investigated were Probit and ID3. The characteristics of the data
set also had two levels: one was dominated by a nominal variable, the other was not.
A data set is said to be dominated by a nominal variable if the variable alone can
correctly classify a significant portion (e.g., 70%) of the input cases. The training
sample sizes included three levels: large/small (L/S), medium/medium (M/M), and
small/large (S/L). Large/small means using a large portion of the samples to derive
the model for predicting a small number of hold-outs. Medium/medium means using
about half of the samples to predict the other half. Small/large means using a small
portion of the samples to predict the remaining samples.

The  dependent  variable  was  the  prediction  accuracy  of  the  model  derived  in  a 
particular  setting.    It  was  defined  as  follows: 

\[
\text{Prediction Accuracy} = \frac{\text{\# of hold-out cases correctly predicted}}{\text{Total \# of hold-out cases}}
\]

The  hypotheses  tested  in  this  experiment  can  be  formulated  as  follows. 

(1)  Effect  of  data  characteristics 

Since  Probit  and  ID3  are  substantially  different  in  many  aspects,  we  anticipate 
that  they  will  have  different  performance  in  analyzing  different  types  of  data.  In 
particular,  we  expect  ID3  to  perform  better  when  a  nominal  factor  has  significant  effect 
on  the  decision  outcome  and  Probit  to  perform  better  otherwise.    That  is, 

H1.1: In a situation where actual classification is dominated by a nominal
variable, ID3 performs better than Probit.

H1.2: In a situation where actual classification is not dominated by a nominal
variable, Probit performs better than ID3.

(2)  Effect  of  training  sample  size 

The  normality  assumption  usually  is  true  only  when  the  training  sample  size  is 
large.  Since  ID3  makes  no  assumption  on  data  distribution,  we  expect  ID3  to  be  less 
sensitive  to  the  decrease  of  sample  size.    That  is, 


H2: A decrease in the size of the training sample set has more effect on Probit
than on ID3.

5.1.  Data  Collection

The  data  sets  used  to  test  the  hypotheses  were  collected  through  a  two-step
process.  First,  two  sets  of  data  with  different  characteristics  were  compiled  from  the 
initial  data  set  constructed  in  the  pilot  study.  One  was  composed  of  firms  in  the 
industries  not  dominated  by  a  particular  inventory  accounting  method,  whereas  the 
other  consisted  of  firms  in  the  industries  dominated  by  a  single  method.  They 
represented  different  effects  of  the  nominal  variable  industry  type.  The  effect  of  the 
industry  SIC  code  was  relatively  low  in  the  first  set  and  high  in  the  second.  The  degree 
of  industry  dominance  used  to  differentiate  these  two  sets  was  3/4.  In  other  words, 
industries  with  more  than  three-fourths  of  their  firms  using  the  same  method  were 
classified  as  industry-dominated  (DOM).  The  rest  were  classified  as  non-industry- 
dominated  (NDOM).  If  we  define  the  degree  of  industry  dominance  as  the  percentage 
of  the  firms  in  the  data  set  whose  actual  inventory  method  can  be  correctly  classified  by 
observing  the  industry  type  only,  these  two  data  sets  have  different  degrees  of  industry 
dominance.  They  are  67.5%  and  99.4%  respectively.  The  industries  with  less  than  five 
firms  in  the  original  data  set  were  eliminated  to  avoid  potential  biases  in  the  next  stage 
of  the  experiment.  Table  9  lists  the  two-digit  SIC  codes  and  number  of  firms  included 
in  these  data  sets. 
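
The degree of industry dominance defined above can be checked directly against
the firm counts. The sketch below is an illustration only: it uses the counts of the
non-industry-dominated set as listed in Table 9, and it assumes that "classifying by
industry type only" means predicting the majority method within each industry. The
function name is hypothetical.

```python
# (FIFO, LIFO) firm counts per two-digit SIC code for the non-industry-dominated
# data set, as listed in Table 9.
NDOM = {
    22: (3, 3), 26: (3, 2), 27: (10, 3), 28: (13, 4), 30: (3, 4),
    34: (8, 7), 35: (14, 7), 39: (6, 2), 50: (12, 4), 59: (6, 3),
}


def degree_of_dominance(counts):
    """Share of firms classified correctly by predicting each industry's majority method."""
    correct = sum(max(fifo, lifo) for fifo, lifo in counts.values())
    total = sum(fifo + lifo for fifo, lifo in counts.values())
    return correct / total


ndom_degree = degree_of_dominance(NDOM)
print(f"Degree of industry dominance (NDOM): {ndom_degree:.1%}")
```

Under this reading, 79 of the 117 firms are classified correctly, which reproduces
the 67.5% reported for the non-industry-dominated set.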


Insert  Table  9  Here 


In the second step of the process, thirty data subsets with three different sizes
were randomly sampled from each of the two data sets. The sample sizes of these
subsets were divided into three levels: large, medium, and small. The large subset
included roughly two-thirds of the samples, the medium subset included one-half of
the samples, and the small subset included one-third of the samples in the data set.
Table 10 shows the sample sizes of these subsets. Since the industry-dominated and
non-dominated data sets had different numbers of firms, their subsets also had
different numbers of firms. These subsets were used as training data from which
decision models were derived. The samples not included in a training subset formed
its counterpart, a testing subset, for evaluating the prediction accuracy of the model
derived from the training subset.
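
The sampling scheme can be sketched as follows. This is a minimal illustration
rather than the procedure actually used: the function names, the random seed, and
the choice of ten repetitions per size level (inferred from thirty subsets over three
sizes) are assumptions.

```python
import random


def split_sizes(n_firms, level):
    """Return (training size, testing size) for the L/S, M/M, and S/L designs."""
    if level == "M/M":
        half = n_firms // 2  # one firm is left out when n_firms is odd
        return half, half
    third = n_firms // 3
    if level == "L/S":
        return n_firms - third, third
    return third, n_firms - third  # S/L


def make_subsets(firms, level, repetitions=10, seed=0):
    """Draw random (training, testing) pairs of the sizes given by split_sizes."""
    rng = random.Random(seed)
    train_n, test_n = split_sizes(len(firms), level)
    pairs = []
    for _ in range(repetitions):
        shuffled = rng.sample(firms, len(firms))
        pairs.append((shuffled[:train_n], shuffled[train_n:train_n + test_n]))
    return pairs


for level in ("L/S", "M/M", "S/L"):
    print(level, split_sizes(117, level), split_sizes(147, level))
```

For data sets of 117 and 147 firms these rules give training and testing sizes of
the kind listed in Table 10, with one firm left out at the medium level so that the
two halves are equal.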


Insert  Table  10  Here 


5.2.  Data  Analysis 

For  each  pair  of  training  and  testing  data  subsets,  the  following  analysis  was 
performed.  First,  Probit  was  used  to  derive  a  linear  model  from  the  training  set. 
Second,  the  model  was  used  to  predict  the  LIFO/FIFO  choices  of  the  firms  in  the 
testing  set  and  to  calculate  the  prediction  accuracy  of  the  model.  Third,  ID3  was  used 
to  analyze  the  same  training  data  sets  and  derive  decision-tree  models.  Fourth,  the 
resulting  models  were  used  to  predict  the  corresponding  testing  data  sets  to  provide 
comparable  results. 

This analysis was conducted over all sixty pairs of data subsets. Table 11 shows
the means and variances of prediction accuracy under various settings. Table 11-(a)
shows the statistics involving a single factor. Table 11-(b) shows the statistics
involving the interaction of two factors (SIZE*DATA and METHOD*DATA). Table 11-(c)
shows the statistics involving the interaction of all three factors. The average
prediction accuracy ranges from 0.6088 to 0.9000.


Insert  Table  11  Here


One-way, two-way, and three-way analyses of variance (ANOVA) were
performed to test the hypotheses. The results of the one-way ANOVA, as illustrated in
Table 12, show that DATA (the characteristics of the data set) had a significant effect
on the prediction accuracy (p=0.01%, R2 = 0.7062). Both methods performed better in
dealing with DOM (the industry-dominated data set). This result is no surprise. It
could be because DOM was less noisy. For example, the degree of industry
dominance was by definition much higher in DOM than in NDOM, which increased the
prediction accuracy. The result indicates that the less noisy a data set is, the more
accurate the resulting model will be. The effects of METHOD and SIZE were not
significant at the 5% level.
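
The F statistic and R2 reported for DATA follow from the sums of squares in
Table 12. The sketch below recomputes them under the usual one-way ANOVA
identities (F = MS model / MS error, R2 = SS model / SS total); the function name is
illustrative, and the inputs are the rounded values printed in the table, so the
results agree only to the printed precision.

```python
def one_way_anova(ss_model, df_model, ss_total, df_total):
    """Return (F, R2) for a one-way ANOVA from its sums of squares and degrees of freedom."""
    ss_error = ss_total - ss_model
    df_error = df_total - df_model
    f_stat = (ss_model / df_model) / (ss_error / df_error)
    r_squared = ss_model / ss_total
    return f_stat, r_squared


# DATA row of Table 12: SS model = 1.2835 (1 DF), SS total = 1.8174 (119 DF).
f_data, r2_data = one_way_anova(1.2835, 1, 1.8174, 119)
print(f"DATA: F = {f_data:.1f}, R2 = {r2_data:.4f}")

# METHOD row of Table 12: SS model = .0484 (1 DF).
f_method, r2_method = one_way_anova(0.0484, 1, 1.8174, 119)
print(f"METHOD: F = {f_method:.2f}, R2 = {r2_method:.4f}")
```

The contrast between the two rows shows how strongly the DATA factor dominates
the total variation when the factors are examined one at a time.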


Insert  Table  12  Here 


Since the insignificance of METHOD and SIZE could be attributed to the
overwhelming DATA effect, a two-way ANOVA was conducted on the DOM and NDOM
data sets separately. The results, as shown in Table 13, indicate that METHOD was
significant in both DOM (p=4.93%) and NDOM (p=0.01%), whereas SIZE and the
interaction of SIZE and METHOD were significant in DOM only (p=0.17% and
p=2.53% respectively). Combining these findings with the descriptive statistics in
Table 11-(b), we found that the ID3 algorithm outperformed Probit in DOM (0.8910
versus 0.8663) but was significantly worse in NDOM (0.6193 versus 0.7244). This
confirms the hypotheses on data characteristics, H1.1 and H1.2. In DOM, the
prediction accuracy decreased significantly (p=0.17%) when the sample size decreased.
The same trend was observed in NDOM, but the effect was not significant at the 5%
level (p=7.14%). The significance of SIZE * METHOD in DOM indicates that the
reduction of sample size had different effects on the two methods. In order to
further understand the details of the interaction among factors, a three-way ANOVA
was conducted.


Insert  Table  13  Here 


Table  14  illustrates  the  results  of  the  three-way  ANOVA.  The  main  effects  of  all 
three  factors  became  significant  in  the  analysis.  In  other  words,  they  all  had  significant 
effect  on  the  prediction  accuracy  of  the  derived  model.  In  addition,  the  interactions  of 
SIZE  and  METHOD  and  of  METHOD  and  DATA  were  also  significant  at  p=3.89%  and
0.01%  respectively.  The  effect  of  SIZE  *  DATA  and  the  interaction  of  the  three  factors 
were  not  significant  at  the  5%  level,  but  the  latter  was  significant  at  10%  level 
(p=8.47%). 


Insert  Table  14  Here 


The significance of the interaction between SIZE and METHOD again supports
our previous argument that the reduction of sample size had different effects on the
two methods. Combining this result with the statistics in Table 11-(c), we found that
the prediction accuracy of Probit had two sharp decreases. In NDOM, its accuracy
decreased from 0.7666 in L/S to 0.7000 in M/M. In DOM, the accuracy decreased from
0.8918 in M/M to 0.8092 in S/L. This effect, as portrayed in Figure 5, was not seen in
the ID3 case, although the accuracy did decrease slightly when the training sample
size decreased. The result confirms hypothesis H2 that Probit is more sensitive to
the reduction of training sample size. The significance of METHOD * DATA indicates
that the characteristics of the data set affected the relative prediction accuracy of
the two methods. This is consistent with the results of the two-way ANOVA, which
support hypotheses H1.1 and H1.2.
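
The different sensitivity of the two methods to training sample size can be read
off the cell means. The short sketch below simply tabulates the change in mean
prediction accuracy from L/S to S/L for each method and data set, using the values
as they appear in Table 11-(c); the variable names are illustrative.

```python
# Mean prediction accuracy per experimental cell, as read from Table 11-(c),
# ordered by training sample size level: (L/S, M/M, S/L).
CELLS = {
    ("NDOM", "ID3"): (0.6230, 0.6261, 0.6088),
    ("NDOM", "Probit"): (0.7666, 0.7000, 0.7064),
    ("DOM", "ID3"): (0.9000, 0.8940, 0.8827),
    ("DOM", "Probit"): (0.8980, 0.8918, 0.8092),
}

# Loss of accuracy when moving from the largest to the smallest training sample.
drops = {cell: round(means[0] - means[-1], 4) for cell, means in CELLS.items()}

for (data, method), drop in drops.items():
    print(f"{data:4s} {method:6s} accuracy drop from L/S to S/L: {drop:.4f}")
```

In both data sets the drop for Probit is several times the drop for ID3, which is
the pattern that hypothesis H2 anticipates.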


Insert  Figure  5  Here


6.  IMPLICATIONS  AND  CONCLUSION

In  this  paper,  we  have  presented  a  non-parametric  method  for  accounting 
research  and  two  experiments  examining  some  methodological  issues.  In  the  first 
experiment,  we  found  that  selection  of  variables  affected  the  significance  of  variables 
and  hence  the  interpretation  of  research  findings.  In  order  to  reduce  inconsistent 
findings  in  classification  research,  therefore,  special  attention  must  be  paid  to  the 
selection  of  variables  to  be  included  in  a  study.  In  addition,  if  previous  findings  in 
different  research  are  to  be  compared,  the  effect  due  to  variable  selection  must  be 
considered. 

In  the  second  experiment,  we  found  that  data  characteristics,  training  sample 
size,  and  data  analysis  method  had  significant  effect  on  the  performance  of  the  resulting 
model of LIFO/FIFO choice. In addition, the interactions between data characteristics
and method, and between sample size and method, were also significant. The
implications of these findings are two-fold.

First,  concerning  research  on  LIFO/FIFO  choice,  the  effect  of  different  data 
analysis  methods  and  the  dominance  of  industry  SIC-code  need  to  be  investigated.  As 
observed  in  the  first  study,  the  industry  SIC-code  was  considered  the  most  important 
factor  in  the  decision  model  derived  by  ID3,  but  was  insignificant  in  the  model  derived 
by  Probit.  Most  previous  research  adopted  Probit  and  tended  to  seek  firm-specific 
economic  reasons  to  explain  the  LIFO/FIFO  decision.  This  may  have  been  subject  to 
the  limitation  of  Probit  in  handling  discrete  variables,  as  discussed  in  Sections  2  and  3. 
Therefore, studies that use methods other than Probit, or that focus on the industry
level using either industry-specific data or data aggregated by industry, would be
desirable.

Second, concerning accounting classification research in general, classification
algorithms other than statistical methods may provide more reliable results under
certain circumstances. Researchers need to consider all alternative methods available
in order to increase the reliability of the results. The ID3 algorithm studied in this
article is only a representative of AI methods. Other algorithms and new versions of
ID3 may provide different results. Of course, there is no one universally best
methodology. Therefore, selection of methodology becomes very important to the
validity of the results.

If  a  choice  is  to  be  made  between  Probit  and  ID3,  data  characteristics  and 
sample  size  are  two  major  factors  that  need  to  be  considered.  In  general,  Probit 
performs  better  when  the  effect  of  nominal  variables  in  the  data  set  is  less  significant 
and  ID3  performs  better  otherwise.  This  is  due  to  their  assumptions  on  data 
distribution and criteria for constructing decision models. The normality assumption
of Probit makes it more sensitive to the decrease of sample size and less able to
handle nominal variables. Its hurdle level, where a sharp decrease of accuracy occurs,
is higher than that of ID3. The lack of a normality assumption in ID3, however, causes
its poor performance in handling large numbers of samples with dominant continuous
variables, but its repetitive decomposition algorithm allows it to handle nominal
variables well.

From  this  brief  analysis,  we  have  compared  the  effect  of  using  Probit  and  ID3  in 
studying  the  LIFO/FIFO  choice  and  shown  that  Probit  and  ID3  are  complementary 
methods  for  accounting  classification  research.  Due  to  the  exploratory  nature  of  the 
work  and  the  complexity  of  the  issue,  further  research  needs  to  be  conducted  to  fully 
understand  the  choice  of  methodology  for  accounting  research.  Directions  include  at 
least  the  following: 

(1) Other data characteristics. In this work, we only examined the degree of
dominance of a single nominal variable. The cases of multiple nominal variables and
other criteria for classifying data characteristics will need to be investigated.

(2)  Other  accounting  problems.  The  results  were  obtained  from  the  LIFO/FIFO 
choice  data.  Further  work  may  be  done  in  studying  other  accounting  classification 
problems.  Bankruptcy  prediction  from  financial  reports,  for  example,  may  also  include 
both  nominal  and  numerical  variables  and  have  similar  effects. 

(3) Other methodologies. As stated previously, ID3 is only a representative of
AI methods. There are other AI methods, such as Michalski's AQ approach (Michalski
and Chilausky, 1980), and algorithms outside the AI area that may also be useful for
accounting research and need to be examined.


Industry Type    Net Sales    Accounting Method

Lumber                 200    FIFO
Lumber                 152    FIFO
Lumber                 312    FIFO
Lumber                 600    FIFO
Lumber                  63    FIFO
Lumber                 400    FIFO
Metal                 1000    FIFO
Metal                  500    LIFO
Metal                 1521    LIFO
Metal                 2300    LIFO
Metal                 1420    LIFO
Metal                  650    LIFO
Metal                 2000    LIFO
Metal                 1500    LIFO

Note: 1. Net sales are in millions of dollars.


Table 1. A Sample Data Set


Variable Name    This Paper    Lee & Hsieh    Dopuch & Pincus

Net  sales 

* 

* 

* 

Total  assets 

* 

* 

CV  of  net  sales 

* 

Relative  frequency  of  sales  growth

*

Net  sales  growth 

* 

CV  of  net  sales  growth 

* 

Inventory 

* 

CV  of  inventory

* 

* 

* 

Inventory/Net  sales 

* 

* 

* 

Inventory/Total  assets 

* 

* 

* 

Net  income 

* 

CV  of  net  income 

* 

Long-term  debt/Equity 

* 

* 

* 

Gross  capital  intensity 

* 

* 

* 

CV  of  input  price 

* 

* 

Growth  of  input  price 

* 

* 

CV  of  growth  of  input  price 

* 

Table  2.  Numerical  Variables  Included  in  the  Initial  Data  Set


SIC CODE  DESCRIPTION                                        FIFO FIRMS  LIFO FIRMS

2000      FOOD AND KINDRED PRODUCTS                                   6           0
2200      TEXTILE MILL PRODUCTS                                       3           3
2300      APPAREL AND OTHER FINISHED PRODUCTS MADE FROM
          FABRICS AND SIMILAR MATERIALS                              14           0
2400      LUMBER AND WOOD PRODUCTS, EXCEPT FURNITURE                  5           1
2500      FURNITURE AND FIXTURES                                      1           0
2600      PAPER AND ALLIED PRODUCTS                                   3           2
2700      PRINTING, PUBLISHING, AND ALLIED INDUSTRIES                10           3
2800      CHEMICALS AND ALLIED PRODUCTS                              13           4
2900      PETROLEUM REFINING AND RELATED INDUSTRIES                               3
3000      RUBBER AND MISCELLANEOUS PLASTIC PRODUCTS                   3           4
3100      LEATHER AND LEATHER PRODUCTS                                            1
3200      STONE, CLAY, GLASS, AND CONCRETE PRODUCTS                               2
3300      PRIMARY METAL INDUSTRIES                                    1           7
3400      FABRICATED METAL PRODUCTS, EXCEPT MACHINERY AND
          TRANSPORTATION EQUIPMENT                                    8           7
3500      INDUSTRIAL AND COMMERCIAL MACHINERY AND COMPUTER
          EQUIPMENT                                                  14           7
3600      ELECTRONIC AND OTHER ELECTRICAL EQUIPMENT AND
          COMPONENTS EXCEPT COMPUTER EQUIPMENT                       60           3
3700      TRANSPORTATION EQUIPMENT                                   15           1
3800      MEASURING, ANALYZING, AND CONTROLLING INSTRUMENTS;
          PHOTOGRAPHIC, MEDICAL AND OPTICAL GOODS; WATCHES
          AND CLOCKS                                                 20           2
3900      MISCELLANEOUS MANUFACTURING INDUSTRIES                      6           2
5000      WHOLESALE TRADE - DURABLE GOODS                            12           4
5100      WHOLESALE TRADE - NONDURABLE GOODS                         11           1
5300      GENERAL MERCHANDISE STORES                                  1           0
5900      MISCELLANEOUS RETAIL                                        6           3

TOTAL                                                               217          60


Table 3. Distribution of Sample Firms


                                      FIFO FIRMS               LIFO FIRMS
VARIABLES                          MEANS     ST. DEV        MEANS     ST. DEV

Net sales                          $341M     $885M          $1,247M   $3,079M
CV of net sales                    .3954     .2073          .3089     .1114
Net sales growth                   .1410     .1165          .0992     .1114
CV of net sales growth             3.723     11.36          3.155     7.418
Relative frequency of sales growth .7839     .1798          .7852     .1556
Total assets                       $220M     $650M          $1,023M   $2,781M
Inventory                          $69M      $171M          $148M     $315M
CV of inventory                    .4144     .2265          .2776     .1164
Net income                         $30M      $97M           $92M      $284M
CV of net income                   .4498     20.516         1.051     7.2037
Long-term debt/Equity              .5094     .7222          .3643     .2637
Inventory/net sales                .2126     .0910          .1636     .0727
Inventory/total assets             .3081     .1211          .2627     .1263
Gross capital intensity            .3141     .2164          .4649     .2880
CV of input price                  .1961     .0304          .2086     .0542
Growth of input price              .0679     .0132          .0734     .0171
CV of growth of input price        .6285     .4146          .6230     .3096

Table 4. Means and Standard Deviations for LIFO and FIFO Firms

Variable Name                  Model 1        Model 2        Model 3
                               (Data set 4)   (Data set 5)   (Data set 6)

Industry type                  Coefficients < 0.01 and insignificant
Net sales                      1.781*         --             1.783*
CV of net sales                .993           1.291          --
CV of net sales growth         --             --             -1.119
Total assets                   --             1.450          --
CV of inventory                -2.742***      -3.082***      -3.154***
Long-term debt/equity          -3.188***      -3.017***      -2.837***
Inventory/net sales            -3.193***      --             -3.221***
Inventory/total assets         --             -1.097         --
Gross capital intensity        1.471          1.254          1.331
Growth of input price          2.191*         2.121*         2.319*
CV of growth of input price    1.328          1.333          1.380

Log Likelihood Ratio           127.27         115.94         128.30


Note: *   ... Significant at least at 5% level.
      **  ... Significant at least at 1% level.
      *** ... Significant at least at .1% level.

Table 7. Models Derived From Data Sets Including Industry Type


Situation                               Model 1    Model 2    Model 3

Probit Method (No Industry Type)        83.03      81.59      82.67
Probit Method (With Industry Type)      86.28      87.00      85.92
ID3 Method (With Industry Type)         100.0      100.0      100.0

Table 8. Percentage of Correct Classification


   Non-industry-dominated              Industry-dominated

SIC code    FIFO    LIFO          SIC code    FIFO    LIFO

22             3       3          20             6       0
26             3       2          23            14       0
27            10       3          24             5       1
28            13       4          33             1       7
30             3       4          36            60       3
34             8       7          37            15       1
35            14       7          38            20       2
39             6       2          51            11       1
50            12       4
59             6       3

Total         78      39          Total        132      15

Table 9. Composition of Two Data Sets for the Second Experiment


             Non-industry-dominated         Industry-dominated

             Train    Test    Total         Train    Test    Total

Large           78      39      117            98      49      147
Medium*         58      58      116            73      73      146
Small           39      78      117            49      98      147

* One was randomly held out to make the training
and testing sample sizes equal.


Table 10. Sizes of the Training and Testing Data Subsets


Factor    Level     N     Mean     Variance

SIZE      L/S       40    .7969    .0150
          M/M       40    .7771    .0155
          S/L       40    .7518    .0150

METHOD    ID3       60    .7552    .0209
          Probit    60    .7953    .0091

DATA      NDOM      60    .6718    .0060
          DOM       60    .8787    .0031

(a) Means and Variances by Factor Levels


                           NDOM                        DOM
Factor    Level     N     Mean     Variance     N     Mean     Variance

SIZE      L/S       20    .6948    .0077        20    .8990    .0012
          M/M       20    .6630    .0035        20    .8911    .0010
          S/L       20    .6576    .0065        20    .8459    .0056

METHOD    ID3       30    .6193    .0033        30    .8910    .0011
          Probit    30    .7244    .0031        30    .8663    .0049

(b) Means and Variances by Factor Levels and Data Sets


                           NDOM                        DOM
SIZE      METHOD    N     Mean     Variance     N     Mean     Variance

L/S       ID3       10    .6230    .0028        10    .9000    .0010
          Probit    10    .7666    .0020        10    .8980    .0016

M/M       ID3       10    .6261    .0016        10    .8940    .0012
          Probit    10    .7000    .0027        10    .8918    .0010

S/L       ID3       10    .6088    .0060        10    .8827    .0011
          Probit    10    .7064    .0024        10    .8092    .0078

(c) Means and Variances by Experimental Cells


Table 11. Means and Variances of the Prediction Accuracy


Factor    Source    DF     SS        MS        F          P(%)     R2

SIZE      Model       2    .0410     .0205     1.35       26.36    .0225
          Error     117    1.7764    .0152
          Total     119    1.8174

METHOD    Model       1    .0484     .0484     3.23       7.5      .0266
          Error     118    1.7690    .0150
          Total     119    1.8174

DATA      Model       1    1.2835    1.2835    283.7**    .01      .7062
          Error     118    .5339     .0045
          Total     119    1.8174


** ... Significant at .01% level


Table 12. One-way ANOVA Results


Source         DF    SS       MS       F         P(%)     R2

SIZE            2    .0162    .0081    2.8       7.14     .5524
METHOD          1    .1655    .1655    56.8**    .01
SIZE*METHOD     2    .0126    .0063    2.2       12.55
ERROR          54    .1574    .0029
TOTAL          59    .3517


** ... Significant at .01% level

(a) Two-way ANOVA on the Non-industry-dominated Data


Source         DF    SS       MS       F         P(%)     R2

SIZE            2    .0328    .0164    7.23**    .17      .3282
METHOD          1    .0092    .0092    4.04*     4.93
SIZE*METHOD     2    .0179    .0089    3.94*     2.53
ERROR          54    .1224    .0023
TOTAL          59    .1822


*  ... Significant at 5% level
** ... Significant at 1% level

(b) Two-way ANOVA on the Industry-dominated Data


Table 13. Two-way ANOVA Results


Source               DF     SS        MS        F          P(%)     R2

SIZE                   2    .0410     .0205     7.9**      .06      .8460
METHOD                 1    .0484     .0484     18.7**     .01
DATA                   1    1.2835    1.2835    495.4**    .01
SIZE*METHOD            2    .0173     .0087     3.4*       3.89
SIZE*DATA              2    .0080     .0040     1.5        21.85
METHOD*DATA            1    .1263     .1263     48.75**    .01
SIZE*METHOD*DATA       2    .0131     .0066     2.53       8.47
ERROR                108    .2798     .0026
TOTAL                119    1.8174

*  ... Significant at 5% level
** ... Significant at .1% level


Table 14. Three-way ANOVA Results

Figure 1. A Sample Decision Tree

Figure 2. Decision Tree for Model 1

Figure 3. Decision Tree for Model 2

Figure 4. Decision Tree for Model 3

Figure 5. Prediction Accuracies of ID3 and Probit


